--- title: "[R] GIS like it's 2019: Pragmatic Workflows for Vector and Raster Data, Pt. 1" author: Ilja / fubits date: '2019-01-29' slug: r-gis-like-it-s-2019-pragmatic-workflows-pt-1 categories: - GIS - Mapping - Open Data tags: - sf - raster - mapview - mapedit - osmdata output: blogdown::html_page: number_sections: yes toc: yes lastmod: '2019-01-29T21:21:12+01:00' description: 'GIS with R is beyond powerful. This is part 1 of a collection of up-to-date workflows for processing spatial data of all kinds with R. The collection involves sf, mapview, mapedit, osmdata, raster, stars, rnaturalearth and other packages combined with straight-forward recipes.' abstract: '' draft: FALSE thumbnail: /img/thumbs/gis_workflows_1.jpg rmdlink: yes keywords: [] comment: no autoCollapseToc: no postMetaInFooter: no hiddenFromHomePage: no mathjax: no mathjaxEnableSingleDollar: no mathjaxEnableAutoNumber: no ---
If you’re just here for the code / hacks, feel free to skip this intro and jump directly to the hands-on part.
To be precise, my R “career” only started to gain traction when I was asked to create a low-budget CC0 print map by a colleague almost one and a half years ago. This particular use case gave me enough of an impression of the awesomeness of the Rstats / FOSS-minded community, the holistic potential of R re: all things data, and in particular the #GIStribe’s continuous efforts to make spatial processing with R accessible to everyone. What I did not know back then (among maaaaany other things) was that there was already existing a dual system of Base R and Tidy / Tidy-fied R - even for spatial…
Screenshot of Miles McBain’s r2vr demo
If you consider that this post by @mdsumner was written already two years ago, and R users potentially have been able to do stuff such as to “colour a WebVR mesh by shading using a raster layer” (cf. screenshot above; McBain 2018) with the r2vr package in R for more than half a year now, I think this quote by Michael D. Sumner just perfectly sums up what’s the status quo and the potential of “spatial / GIS with R”:
“GIS itself needs what we can already do in R, it’s not a target we are aspiring to[,] it’s the other way around.” (Sumner 2017)
I mean, this rayshader output (pictured below; click on the image to see the full GIF in action) by the package’s developer Tyler Morgan-Wall is not just a proof-of-concept. It’s out there on CRAN, it’s one technique already available for everyone with just a basic understanding of R and/or GIS.
Screenshot of Tyler Morgan-Wall’s rayshader demo. Click this link to see the GIF in action.
Or have a look at what Michael D. Sumner has been working on with the quadmesh package, which will allow us - among other things - to take raster to another level. Same for Edzer Pebesma’sstars package with regards to its potential for tidy spatiotemporal data.
In short, GIS with R is efficient, effective, performative, affordable, and Shiny shiny and complex enough to serve as the domain for showcasing R and as a exhaustive use case for new R users (or their supervisors).
Recently, I did another but far more extensive freelance gig involving custom mapping and real-world geodata, and thereby discovered so many really helpful / effective / efficient / cool packages, methods, tools and workflows that I decided to write another GIS/mapping post. It’s simply time to give something back to the community (and to create a useful reference entry for my future self).
In this post, we’re going to take a tour through several versatile spatial packages for R. As a use case, we’re going to visualize aggregated crime data (district-level; yearly) for my hometown Berlin, Germany.
This post is not about basic geocomputation / GIS / cartography concepts. Regarding this, there already are plenty of excellent up-to-date resources such as Lovelace/Novosad 2018, Jesse Adler’s solid series on R for Digital Humanities, or https://statnmap.com/2018-07-14-introduction-to-mapping-with-sf-and-co/.
Rather, this post is more like my own glossary for all the magical things / hacks.
If there’s a single really helpful thing to have present in mind for now, it’s this common basic representation of GIS’ data layers:
Source: National Coastal Data Development Centre (NCDDC), National Oceanic and Atmospheric Administration (NOAA), USA
library(tidyverse)
library(sf)
library(mapview)
library(mapedit)
library(rnaturalearth)
library(osmdata)
# library(raster) # I prefer to use raster::fun() since raster::select() masks dplyr::select()
Let’s assume that we want to make some kind of a map, be it because we actually need a map (i.e. for a great book on European foreign policy), or maybe just to give some spatial data some kind of a cognitive canvas. As geodata tends to take up non-trivial amounts of resources (bandwidth, memory, CPU), it could be useful to focus on only a particular geographic extent. This is where I learned to love the concept of a bounding box (BBOX). Instead of literally downloading the whole world (thereby straining someone else’s bandwidth / server capacity) and then running costly query/filter operations, we can easily define bounds first.
If you’re spatially literate and therefore easily can spot whether you need Lat/Lon or Lon/Lat, or what projection / CRS your favourite BBOX-providing workflow offers, you probably can skip this and just define your BBOX manually with something like bbox <- c(xmin, ymin, xmax, ymax). I tend to struggle with this (even when using web tools such as http://bboxfinder.com), because I either mix up Lat/Lon or underestimate the spatial extent of my geodata or whatever. This is why I think that these three approaches below might be helpful to others, too.
So first of all we want to decide on the extent of the basemap. Since this is a recurrent task but with varying parameters, I’m offering three different approaches, based on
Here, we start with our spatial object of reference. I prefer to work with Open Access / Public Domain data (and not Google Maps / Bing et al.), so let’s fetch some vector data from the Natural Earth project - but as class = Simple Feature instead of Esri’s proprietary (and less tidyverse-friendly) Shapefile format.
For your use case, consult the vignette for
rnaturaleath::ne_download()and the Natural Earth website. For the sake of resource efficiency I also recommend to download the data once and then to store it locally withsave/saveRDSfor future use. As I work a lot with Natural Earth data, I have mounted a network folderD:/GIS/with all the raster and vector data I’ve downloaded so far.However, I do not recommend download the “Download all X themes” file, since this gives you a) shapefiles instead of Simple Features, and b) UTF-8 encoding issues* for the feauture labels, depending on when the particular theme was parsed the last time.
substates10 <- rnaturalearth::ne_download(scale = 10, type = "admin_1_states_provinces", category = "cultural", returnclass = "sf")
# saveRDS(substates10, file = str_c(here::here("data", "GIS_workflows", "/"), "substates10.rds"))
# substates10 <- readRDS(file = str_c(here::here("data", "GIS_workflows"), "/", "substates10.rds"))
berlin_sf <- substates10 %>% filter(name == "Berlin")
Next, we’ll use the amazing sf package to fetch Berlin’s bounding box.
berlin_bbox <- sf::st_bbox(berlin_sf)
So… wanna have a quick glimpse at Berlin’s bounding box? Here’s all that it takes with the game-changing mapview package by Tim Appelhans et al.:
berlin_bbox %>% mapview::mapview()
Caveat: To keep memory and CPU load low for everyone browsing this post, I mostly will display the output of interactive widgets as
.jpg.
If you want to set the “CartoDB Dark Matter” basemap as your default (or any other of the themes supported by Leaflet ), you might want to add this setting to your
.Rprofile:
mapviewOptions(basemaps = c("CartoDB.DarkMatter", "CartoDB.Positron", "Esri.WorldImagery", "OpenStreetMap", "OpenTopoMap"))
And since mapview() is cool, we clan plot and inspect our Berlin object and the BBOX at the same time. For this we’re going to convert the BBOX object into a regular simple feature object with sf::st_as_sfc on the fly:
mapview::mapview(list(sf::st_as_sfc(berlin_bbox), berlin_sf))

This box would probably be too narrow for further static editing (i.e. for a print map), so it would be cool if we could simply increase the BBOX’s extent. Since the coordinates in the bbox are stored as a numeric vector c(xmin, ymin, xmax, ymax), we can easily expand the bbox by providing another vector of length = 4 and see if the extent is better on the fly:
(berlin_bbox + c(-0.1, -0.1, 0.1, 0.1)) %>%
# sf::st_as_sfc() %>%
mapview::mapview()

That seems generous enough. So let’s preserve this box for later.
berlin_bbox <- berlin_bbox + c(-0.1, -0.1, 0.1, 0.1)
Another approach to get a reasonable bounding box is to calculate the BBOX based on your geodata.
As mentioned above, I’m going to use the freshly released cycling data from the Berlin-based Tagespiegel DDJ / Innovation Lab for the final use case. The data has been collected as part of the sensor-based #radmesser project, where Lab leader Hendrik Lehmann’s team of journalists and developers equipped 100 (!) cyclists with close-range sensors to measure the distance of passing-by cars and trucks on Berlin’s roads. Goal: Demonstrate the at-risk status of cyclists in Germany’s capital.
Make sure to check out the award-winning project’s website and the project’s repo.
It’s probably worth the remark that getting the data into R is as simple as sf::st_read(URL)…
# License: ODC-By v1.0/Tagesspiegel Radmesser/https://radmesser.de
# cf. https://github.com/tagesspiegel/radmesser/blob/master/opendata/LICENSE.md
berlin_bike <- sf::st_read("https://github.com/tagesspiegel/radmesser/blob/master/opendata/detailnetz_ueberholvorgaenge.geo.json?raw=true")
# saveRDS(berlin_bike, file = str_c(here::here("data", "GIS_workflows", "/"), "berlin_bike.rds"))
# berlin_bike <- readRDS(file = str_c(here::here("data", "GIS_workflows"), "/", "berlin_bike.rds"))
However, st_read silently drops the “stats” field (which is legit, since a GeoJSON feature is defined as
geometry + propertiesonly) which contains the single measurements. Or rather: I have no 1-liner idea how to preserve it, be it in QGIS, withjsonlite, or by transforming with http://geojson.io. Fortunatly, the Radmesser-team also provides a CSV with the measurements and a column with the respective streets key. So we can adress this later (in Pt. 2 of this series).
Nonetheless, we can quickly have a look at the traced roads where the 15K+ measurements where taken, and also visualise the road class.
mapview::mapview(berlin_bike, zcol ="STRKLASSE1")

And now the data-based BBOX:
berlin_bike_bbox <- sf::st_bbox(berlin_bike)
mapview::mapview(list(st_as_sfc(berlin_bike_bbox), berlin_bike))